Abstract

The comparison of the strength of association between a variable, $X_1$, and each of two potential linear predictors, $X_2$ and $X_3$, is reexamined. The variances of $X_2$ and $X_3$ are nuisance parameters, which must be assumed to be equal in the procedure recently suggested by Wolfe [11]. In this note a simple modification of Wolfe's test is proposed. The use of ranks allows one to avoid the scale problem.

Key words: Rank correlation, Unequal variances, Association, Normal scores

1. Introduction

Let $(X_{1i}, X_{2i}, X_{3i})$, $i = 1, \ldots, n$, be a random sample of observations from a continuous trivariate distribution. In many situations, we are interested in determining which of $X_2$ and $X_3$ is more strongly correlated with $X_1$.

Wolfe [10] showed that if $\mathrm{Var}(X_2) = \mathrm{Var}(X_3)$ then the correlation between $X_1$ and $X_2$ is equal to that between $X_1$ and $X_3$ if and only if $X_1$ and $Z = X_3 - X_2$ are uncorrelated. Subsequently, the same author [11] proposed a distribution-free procedure for detecting a difference between the two correlations.

Research  partially  supported  by  ONR  Contracts  N00014-75-C-0439  and 

The procedure was exemplified for a set of heart disease data with the significant indication that $X_3$ is more positively related to $X_1$ than is $X_2$.

Examination of the data suggests otherwise, since $r_{12}$ is greater than $r_{13}$, where $r_{ij}$ is the sample product moment correlation coefficient between $X_i$ and $X_j$. The problem is that the equality of variance assumption evidently does not hold for these data and that this assumption is clearly crucial in the inferential process.

In Section 3 an alternative procedure is proposed which eliminates the scale problem. The method is applied to the same heart disease data from [11] and quite a different conclusion is reached.

2. Related Correlation Coefficients

Let $X_1$, $X_2$, $X_3$ have a continuous trivariate distribution with covariance matrix $\Sigma$. Let $\sigma_{ij}$ denote the $(i,j)$ element of $\Sigma$, with $\sigma_{ii} = \sigma_i^2$ $(i = 1, 2, 3)$. For the trivariate normal the problem of testing $\rho_{12} = \rho_{13}$ has been discussed by Hotelling [7] and more recently by Dunn and Clark [3,4]. A distribution-free approach relies upon the observation by Wolfe [10] that the correlation between $X_1$ and $Z$ is given by

\[
\rho_{1Z} = \frac{\sigma_{13} - \sigma_{12}}{\sigma_1 \left(\sigma_2^2 + \sigma_3^2 - 2\rho_{23}\sigma_2\sigma_3\right)^{1/2}} \tag{1}
\]

and thus $\rho_{1Z} > 0$ implies $\rho_{12}\sigma_2 < \rho_{13}\sigma_3$, and in fact $\rho_{12} < \rho_{13}$ follows only if $\sigma_2 = \sigma_3$. This restriction on $\sigma_2$ and $\sigma_3$ is also necessary for Kendall's $\tau_{1Z} > 0$ to reasonably imply that $\rho_{13} > \rho_{12}$. Consider the special case of joint normality for which $\tau = (2/\pi)\arcsin(\rho)$. Thus, from (1) and the fact that $\tau > 0$ if and only if $\rho > 0$, we see that $\tau_{1Z} = 0$ does not imply either $\tau_{13} = \tau_{12}$ or $\rho_{13} = \rho_{12}$, only that $\rho_{12}\sigma_2 = \rho_{13}\sigma_3$. Similarly, $\tau_{1Z} > 0$ does not preclude either $\tau_{13} < \tau_{12}$ or $\rho_{13} < \rho_{12}$.
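
The point that a positive correlation between $X_1$ and $Z$ can coexist with $\rho_{13} < \rho_{12}$ can be illustrated numerically. The following is a minimal sketch that evaluates equation (1) and the normal-theory relation $\tau = (2/\pi)\arcsin(\rho)$; the parameter values and function names are hypothetical, chosen only so that $\sigma_3$ is much larger than $\sigma_2$.

```python
import math

def rho_1z(s1, s2, s3, r12, r13, r23):
    """Correlation between X1 and Z = X3 - X2, following equation (1)."""
    cov_1z = r13 * s1 * s3 - r12 * s1 * s2        # sigma_13 - sigma_12
    var_z = s2**2 + s3**2 - 2.0 * r23 * s2 * s3   # Var(X3 - X2)
    return cov_1z / (s1 * math.sqrt(var_z))

def tau_from_rho(rho):
    """Kendall's tau under joint normality: (2/pi) * arcsin(rho)."""
    return (2.0 / math.pi) * math.asin(rho)

# Hypothetical parameters (not from the paper): unequal scales, rho_13 < rho_12.
params = dict(s1=1.0, s2=1.0, s3=4.0, r12=0.6, r13=0.4, r23=0.5)

rho = rho_1z(**params)
tau = tau_from_rho(rho)
print(f"rho_1Z = {rho:.3f}, tau_1Z = {tau:.3f}, "
      f"rho_12 = {params['r12']}, rho_13 = {params['r13']}")
```

With these values both $\rho_{1Z}$ and $\tau_{1Z}$ are positive even though $X_2$ is the more strongly correlated predictor, which is the situation the equal variance assumption rules out.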

The assumption of equal variances for $X_2$ and $X_3$ is suspect for the data analyzed by Wolfe [11]. A sample value of Kendall's $\tau$ between $X_1$ and $Z$ of .35 is obtained, and viewed as a significant indication that $X_3$ is more positively related to $X_1$ than is $X_2$. The data do not support that conclusion, when the measure of "positively related" is any of the standard measures of association. For instance $r_{12} = .673$ but $r_{13} = .511$, and $\tau_{12} = .558$ but $\tau_{13} = .400$. The sample covariances, $s_{12} = 280.56$ and $s_{13} = 939.33$, are certainly consistent with the inference that $\sigma_{13} > \sigma_{12}$. However, the magnitudes of the sample variances, $s_2^2 = 4.25$ and $s_3^2 = 82.56$, suggest that it is not unreasonable for the sense of the inequality to be reversed for $\rho_{13}$ and $\rho_{12}$.
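
The arithmetic behind this remark can be checked directly from the reported summary statistics. The sketch below uses only the values quoted above; the variable names are illustrative. It confirms that the covariance difference, which fixes the sign of the sample analogue of (1), is positive while $r_{13} < r_{12}$, and that both correlations imply essentially the same standard deviation for $X_1$.

```python
import math

# Summary statistics as reported in the text.
s12, s13 = 280.56, 939.33        # sample covariances with X1
s2_sq, s3_sq = 4.25, 82.56       # sample variances of X2 and X3
r12, r13 = 0.673, 0.511          # sample product moment correlations

# Sign of the numerator of (1) in sample form: s13 - s12.
numerator = s13 - s12

# Standard deviation of X1 implied by each correlation, r = s_1j / (s_1 * s_j).
s1_from_12 = s12 / (r12 * math.sqrt(s2_sq))
s1_from_13 = s13 / (r13 * math.sqrt(s3_sq))

print(f"s13 - s12 = {numerator:.2f}")
print(f"implied s1: {s1_from_12:.1f} (from r12), {s1_from_13:.1f} (from r13)")
print(f"variance ratio s3^2 / s2^2 = {s3_sq / s2_sq:.1f}")
```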
3. Procedure for Unequal Variance

A number of modified procedures are available, all in the spirit of the method put forth by Wolfe. One approach involves scoring the $X_{2i}$ and $X_{3i}$ with an order preserving transformation that will circumvent the scale problem. Replacing each of the $X_2$ and $X_3$ vectors by their integer ranks, $R(X_{2i})$ and $R(X_{3i})$, is one possibility. The problem that arises here is that the $Z_i' = R(X_{3i}) - R(X_{2i})$ would necessarily involve a substantial number of ties. Replacing $X_{2i}$ and $X_{3i}$ by their expected normal scores will reduce the magnitude of this problem and still eliminate the scale problem.

Therefore, the suggested procedure is to replace the $X_{2i}$ and $X_{3i}$ by the corresponding expected normal scores, $a_i = E[Z_{(i,n)}]$, where $Z_{(i,n)}$ here denotes the $i$-th order statistic in a sample of $n$ standard normal variables, and to use any of the usual nonparametric measures of correlation to detect association between the $X_{1i}$ and the differences of the scores, $Z_i' = a(X_{3i}) - a(X_{2i})$. Because there is not a familiar population quantity (in terms of the original parameters) corresponding to the relationship between $X_1$ and $Z'$, we shall not attempt to state formal hypotheses. Nevertheless, the technique allows one to make general inferences about the relationships of $X_1$ with $X_2$ and $X_3$. Note that this same difficulty accompanied the technique proposed by Wolfe [11]. Using Wolfe's test one is able to infer that $X_1$ and $Z$ are positively (negatively) related. The difficulty arises in extending the knowledge of the relationship between $X_1$ and $Z$ to knowledge about the strength of the relationship between $X_1$ and $X_3$ relative to that between $X_1$ and $X_2$.
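
The procedure just described can be sketched in a few lines. This is an illustration under stated assumptions, not the computation used for the heart disease data: the expected normal scores are approximated by Blom's formula $\Phi^{-1}((i - 3/8)/(n + 1/4))$ rather than taken from tables, ties are assumed absent, Kendall's coefficient is computed in its simple tau-a form, and the data and function names are hypothetical.

```python
from statistics import NormalDist

def approx_normal_scores(values):
    """Approximate expected normal scores via Blom's formula (assumes no ties)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank0, idx in enumerate(order):
        scores[idx] = NormalDist().inv_cdf((rank0 + 1 - 0.375) / (n + 0.25))
    return scores

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def score_difference_tau(x1, x2, x3):
    """Kendall's tau between X1 and Z' = a(X3) - a(X2)."""
    a2 = approx_normal_scores(x2)
    a3 = approx_normal_scores(x3)
    z_prime = [b - a for a, b in zip(a2, a3)]
    return kendall_tau(x1, z_prime)

# Hypothetical data: X2 tracks X1 exactly in rank; X3 is on a much larger scale.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [10, 20, 30, 40, 50, 60]
x3 = [300, 100, 500, 200, 600, 400]

tau = score_difference_tau(x1, x2, x3)
print(f"tau between X1 and Z' = {tau:.3f}")
```

A negative value points toward $X_2$ as the predictor more strongly related to $X_1$, regardless of how much larger the scale of $X_3$ is.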

Since ties within the $X_{2i}$ and $X_{3i}$ are a problem that may be encountered, a consistent method for dealing with them is proposed. Averaging the scores for the tied values and then rescaling by a function of the total sum of squares is a scheme which both remains consistent with the midranking procedure and maintains the equal variance property. If the normal scores are denoted by $\{a_i\}$ and the midranked set by $\{a_i^*\}$ then the proposed scores are given by

\[
a_i' = a_i^* \left[ \sum a_i^2 \Big/ \sum a_i^{*2} \right]^{1/2} .
\]

Table 1 reflects the application of this rule due to ties among the $X_{2i}$ and $X_{3i}$. If there are no ties then $a_i' = a_i$ and the equal variance requirement is satisfied. The scores $\{a_i\}$ and their sum of squares are tabled in several places, e.g. Owen [9] p. 151.
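
The tie adjustment can be sketched as follows. The function name is illustrative, and the scores $a_i$ are again approximated by Blom's formula as a stand-in for tabled expected normal scores (an assumption of this sketch). Scores are averaged within each group of tied observations and then rescaled so that the sum of squares matches that of the untied scores.

```python
import math
from statistics import NormalDist

def tie_adjusted_scores(values):
    """Average normal scores over ties, then rescale to keep the sum of squares."""
    n = len(values)
    # a_i for ranks 1..n (Blom approximation; tabled values could be used instead).
    base = [NormalDist().inv_cdf((i + 1 - 0.375) / (n + 0.25)) for i in range(n)]
    order = sorted(range(n), key=lambda i: values[i])
    averaged = [0.0] * n                      # a*_i, in the original data order
    start = 0
    while start < n:
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_score = sum(base[start:end + 1]) / (end - start + 1)
        for k in range(start, end + 1):
            averaged[order[k]] = mean_score
        start = end + 1
    # a'_i = a*_i * sqrt(sum a_i^2 / sum a*_i^2); undefined if every value is tied.
    scale = math.sqrt(sum(a * a for a in base) / sum(a * a for a in averaged))
    return [scale * a for a in averaged]

# Hypothetical sample with one tied pair.
adjusted = tie_adjusted_scores([5, 3, 3, 8, 1])
print([round(a, 3) for a in adjusted])
```

Because every variable scored this way has the same sum of squares, the scored $X_2$ and $X_3$ have equal sample variances whether or not ties occur.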

Table 1. Heart Disease Data

After making these adjustments we obtain values of -.183 for Kendall's $\tau$ and -.294 for the Spearman rank correlation coefficient. Both of these values indicate that $X_1$ is more strongly correlated with $X_2$ than with $X_3$, which is consistent with the values mentioned in Section 2.

4. Alternate Approaches and a Related Problem

The test procedures discussed in this note are formulated to operate reliably in the presence of unequal variances. A second but less attractive alternative procedure is to perform a preliminary test of $\sigma_2^2 = \sigma_3^2$ and use Wolfe's test if that hypothesis is not rejected. Noting that $X_2$ and $X_3$ are not independent, and that standard nonparametric methods for the paired-sample scale problem appear not to be well known, a simple method is given here.

For bivariate normal pairs the solution is obtainable from a result due to Pitman (see Kendall and Stuart [8] pp. 139 and 531).

Let $(X, Y)$ be a bivariate random variable with finite second moments. It is easily shown that the sum and difference are uncorrelated if and only if the two variances are equal. Hence, letting $U = X + Y$ and $V = X - Y$ and noting that $\mathrm{Cov}(U, V) = \sigma_x^2 - \sigma_y^2$, it may be seen that a significant indication of positive (negative) correlation between $U$ and $V$ implies that $\sigma_x^2$ is greater (less) than $\sigma_y^2$. Therefore a test of $H_0: \sigma_x^2 = \sigma_y^2$ can be based upon a rank correlation statistic. For the heart disease data in Table 1 define $U_i = X_{3i} + X_{2i}$ and $V_i = X_{3i} - X_{2i}$ $(i = 1, \ldots, 16)$. The Spearman rank correlation coefficient for the bivariate sample $\{(U_i, V_i)\}$ is .968, a highly significant indication that $\sigma_3^2 > \sigma_2^2$. Thus Wolfe's test is not appropriate and a modified procedure such as that proposed in Section 3 is indicated.
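
The paired-sample scale check described above amounts to a rank correlation between sums and differences. The following sketch computes Spearman's coefficient as the product moment correlation of midranks; the data and function names are hypothetical and the Table 1 values are not reproduced.

```python
import math

def midranks(values):
    """Ranks 1..n, with tied values given the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks

def pearson(x, y):
    """Sample product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def scale_rank_correlation(x, y):
    """Spearman correlation of U = X + Y and V = X - Y; positive suggests Var(X) > Var(Y)."""
    u = [a + b for a, b in zip(x, y)]
    v = [a - b for a, b in zip(x, y)]
    return pearson(midranks(u), midranks(v))

# Hypothetical paired sample in which X is far more variable than Y.
x = [12.0, -7.5, 3.1, 20.4, -15.2, 8.8, -1.3, 5.6]
y = [0.4, -0.2, 0.1, 0.5, -0.6, 0.3, 0.0, 0.2]

r_s = scale_rank_correlation(x, y)
print(f"Spearman correlation of (U, V) = {r_s:.3f}")
```

Exchanging the roles of the two variables reverses the sign of the statistic, so the direction of the inequality between the variances can be read from its sign.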

Still another approach would be to standardize the $X_{2i}$ and $X_{3i}$ values, dividing by their individual sample standard deviations. A case can be made for the legitimacy of treating the differences

\[
Z_i'' = \frac{X_{3i}}{s_3} - \frac{X_{2i}}{s_2}, \qquad i = 1, \ldots, n,
\]

in the same fashion that Wolfe treats the $Z_i$ by appealing to a multivariate extension of the Theorem of Fligner, Hogg and Killeen [5] to establish the exchangeability of the $Z_i''$. Proceeding formally with this approach yields rank correlations between the $X_{1i}$ and $Z_i''$ that are in close agreement with the results obtained using normal scores, namely, -.244 for Spearman's coefficient and -.150 for Kendall's.

Next, the asymptotically distribution-free test proposed by Davis and Quade [1] may also be adapted to this problem. This approach relies upon the large sample normality of the U-statistic, which is simply the difference of the two Kendall rank correlation coefficients, $\tau_{12} - \tau_{13}$. The observed value of this difference is .158 with an estimated standard deviation of .121 and hence is significant ($p < .10$) in the direction opposite of that implied by Wolfe's test.
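
The reported significance level can be checked from the two quoted numbers. The sketch below forms the ratio of the observed difference to its estimated standard deviation and refers it to the standard normal distribution; reading the stated $p < .10$ as a one-sided normal approximation is an assumption of this sketch.

```python
from statistics import NormalDist

tau12, tau13 = 0.558, 0.400      # Kendall coefficients reported in Section 2
diff = tau12 - tau13             # observed difference, reported as .158
se = 0.121                       # estimated standard deviation, as reported

z = diff / se
p_one_sided = 1.0 - NormalDist().cdf(z)   # assumes a one-sided normal reference

print(f"difference = {diff:.3f}, z = {z:.3f}, one-sided p = {p_one_sided:.3f}")
```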

Finally, the jackknife method should be mentioned as a second asymptotically robust technique. The results of Duncan and Layard [2] suggest that a highly satisfactory approach would be to jackknife the difference of Fisher's z transformation applied to each of $r_{12}$ and $r_{13}$. The $n$ pseudovalues arise from omitting each of the trivariate cases one at a time. For a modification to the degrees of freedom of the approximate Student t distribution see Hinkley [6].
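
A minimal sketch of the jackknife just described follows. The data and function names are hypothetical, the pseudovalues use the usual form $n\hat\theta - (n-1)\hat\theta_{(-k)}$, and the resulting ratio would be referred to a Student t distribution with $n - 1$ degrees of freedom in the unmodified version; the degrees of freedom adjustment of Hinkley [6] is not implemented here.

```python
import math

def pearson(x, y):
    """Sample product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def z_diff(x1, x2, x3):
    """Difference of Fisher z transforms: atanh(r12) - atanh(r13)."""
    return math.atanh(pearson(x1, x2)) - math.atanh(pearson(x1, x3))

def jackknife_z_diff(x1, x2, x3):
    """Return (estimate, standard error, t ratio, pseudovalues)."""
    n = len(x1)
    full = z_diff(x1, x2, x3)
    pseudo = []
    for k in range(n):
        keep = [i for i in range(n) if i != k]   # omit the k-th trivariate case
        loo = z_diff([x1[i] for i in keep],
                     [x2[i] for i in keep],
                     [x3[i] for i in keep])
        pseudo.append(n * full - (n - 1) * loo)
    est = sum(pseudo) / n
    var = sum((p - est) ** 2 for p in pseudo) / (n * (n - 1))
    se = math.sqrt(var)
    return est, se, est / se, pseudo

# Hypothetical trivariate sample (not the heart disease data).
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
x3 = [3, 1, 4, 1, 5, 9, 2, 6]

est, se, t_ratio, pseudo = jackknife_z_diff(x1, x2, x3)
print(f"jackknife estimate = {est:.3f}, s.e. = {se:.3f}, t = {t_ratio:.3f}")
```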

5. Summary

The method recently proposed by Wolfe [11] is modified in this note to perform in the presence of unequal variances. In considering the equal variance assumption a simple rank correlation test is proposed for the paired-sample scale problem. While there are a variety of procedures for comparing two related correlation coefficients, additional work is needed to extend them to the case of several coefficients. To assess the relative efficiencies of the various available methods under a variety of realistic joint distributions a Monte Carlo study is probably required.